Implementation of Parallel Collection Equi-Join Using MPI
Authors
Abstract
One type of collection join in Object-Oriented Databases (OODB) is the collection equi-join. The main feature of collection joins is that they involve collection types. In this paper we present our experience in implementing collection equi-join algorithms using the Message Passing Interface (MPI). In particular, we lay out the fundamental techniques used in the implementation, which may also be applicable to other collection joins. The two collection equi-join algorithms discussed here are Double Sortmerge and Sort Hash Join. The implementation was done in a clustered environment and employed data parallelism.
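As a rough illustration of the idea behind a sort-then-hash collection equi-join under data parallelism, here is a minimal single-process sketch. It is not the paper's MPI implementation: the function names (`normalize`, `sort_hash_join`, `partition`, `parallel_join`), the `(id, collection)` input layout, and the hash-based partitioning rule are all assumptions made for this example, and the "workers" are simulated by a plain loop rather than MPI ranks. Two collections are treated as equal when their sorted elements match.

```python
from collections import defaultdict


def normalize(collection):
    # Sort the elements so that collections with the same members,
    # stored in different orders, produce the same key.
    return tuple(sorted(collection))


def sort_hash_join(r, s):
    # Local join: build a hash table on the sorted collections of r,
    # then probe it with the sorted collections of s.
    table = defaultdict(list)
    for rid, coll in r:
        table[normalize(coll)].append(rid)
    return [(rid, sid)
            for sid, coll in s
            for rid in table.get(normalize(coll), [])]


def partition(objs, n_workers):
    # Hypothetical partitioning rule: equal collections always hash to
    # the same partition, so each worker can join independently.
    parts = [[] for _ in range(n_workers)]
    for oid, coll in objs:
        parts[hash(normalize(coll)) % n_workers].append((oid, coll))
    return parts


def parallel_join(r, s, n_workers=2):
    # Each loop iteration stands in for one worker; in an MPI program
    # every rank would process its own pair of partitions instead.
    result = []
    for rp, sp in zip(partition(r, n_workers), partition(s, n_workers)):
        result.extend(sort_hash_join(rp, sp))
    return sorted(result)


if __name__ == "__main__":
    r = [(1, [3, 1, 2]), (2, [5, 4])]
    s = [(10, [2, 3, 1]), (11, [4, 5]), (12, [1, 2])]
    print(parallel_join(r, s))
```

Because partitioning is driven by the normalized collection, no pair of matching objects is split across workers, so the local results can simply be concatenated.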
Similar Resources
Parallel Collection Equi-Join Algorithms for Object-Oriented Databases
One of the differences between relational and object-oriented databases (OODB) is that attributes in OODB can be of a collection type (e.g. sets, lists, arrays, bags) as well as a simple type (e.g. integer, string). Consequently, explicit join queries in OODB may be based on collection attributes. One form of collection join queries in OODB is “collection-equi join queries”, where the joins are ...
An Efficient Parallel Algorithm for High Dimensional Similarity Join - Parallel Processing Symposium, 1998, and Symposium on Parallel and Distributed Processing 1998. 19
Multidimensional similarity join finds pairs of multidimensional points that are within some small distance of each other. The ε-k-d-B tree has been proposed as a data structure that scales better as the number of dimensions increases compared to previous data structures. We present a cost model of the ε-k-d-B tree and use it to optimize the leaf size. We present novel parallel algorithms for t...
Parallel Sub-Collection Join Algorithm for High Performance Object-Oriented Databases
In Object-Oriented Databases (OODB), although path expression between classes may exist, it is sometimes necessary to perform an explicit join between two or more classes due to the absence of pointer connections or the need for value matching between objects. Furthermore, since objects are not in a normal form, an attribute of a class may have a collection as a domain. Collection attributes ar...
An evaluation of the performance of parallel database operators using Phoenix MapReduce
The database join operator is the most expensive operator of the relational algebra operators. Many highly efficient sequential and parallel operators exist, based on several core techniques: sort-merge, hash and nested-loops. We present the design and implementation of two parallel operators: an equi-join and a grouping aggregation. They utilise the emerging MapReduce paradigm, specifically a ...
Toward Parallel CFA with Datalog, MPI, and CUDA
We present our recent experience working to design parallel functional control-flow analysis (CFA) using an encoding in Datalog and underlying relational algebra implemented for SIMD coprocessors and supercomputers. Control-flow analysis statically models the possible propagations of data and control through a target program, finitely obtaining a bound on reachable expressions and environments ...